Profile Guided Dataflow Transformation for FPGAs and CPUs

نویسندگان

  • Robert J. Stewart
  • Deepayan Bhowmik
  • Andrew M. Wallace
  • Greg J. Michaelson
چکیده

This paper proposes a new high-level approach for optimising field programmable gate array (FPGA) designs. FPGA designs are commonly implemented in low-level hardware description languages (HDLs), which lack the abstractions necessary for identifying opportunities for significant performance improvements. Using a computer vision case study, we show that modelling computation with dataflow abstractions enables substantial restructuring of FPGA designs before lowering to the HDL level, and also improve CPU performance. Using the CPU transformations, runtime is reduced by 43%. Using the FPGA transformations, clock frequency is increased from 67MHz to 110MHz. Our results outperform commercial low-level HDL optimisations, showcasing dataflow program abstraction as an amenable computation model for highly effective FPGA optimisation.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dataflow computation with intelligent memories emulated on field-programmable gate arrays (FPGAs)

This paper presents a new design that implements the data-driven (i.e. dataflow) computation paradigm with intelligent memories. Also, a relevant prototype that employs FPGAs is presented for the support of intelligent memory structures. Instead of giving the CPU the privileged right to decide what instructions to fetch in each cycle (as is the case for control-flow CPUs), instructions in dataf...

متن کامل

High-performance computing using accelerators

A recent trend in high-performance computing is the development and use of heterogeneous architectures that combine fine-grain and coarse-grain parallelism using tens or hundreds of disparate processing cores. These processing cores are available as accelerators or many-core processors, which are designed with the goal of achieving higher parallel-code performance. This is in contrast with trad...

متن کامل

Synthesizing Hardware from Dataflow Programs

The MPEG Reconfigurable Video Coding working group is developing a new library-based process for building the reference codecs of future MPEG standards, which is based on dataflow and uses an actor language called Cal. The paper presents a code generator producing RTL targeting FPGAs for Cal, outlines its structure, and demonstrates its performance on an MPEG-4 Simple Profile decoder. The resul...

متن کامل

Achieving 10Gbps Line-rate Key-value Stores with FPGAs

Distributed in-memory key-value stores such as memcached have become a critical middleware application within current web infrastructure. However, typical x86based systems yield limited performance scalability and high power consumption as their architecture with its optimization for single thread performance is not wellmatched towards the memory-intensive and parallel nature of this applicatio...

متن کامل

A Dataflow IR for Memory Efficient RIPL Compilation to FPGAs

Field programmable gate arrays (FPGAs) are fundamentally different to fixed processors architectures because their memory hierarchies can be tailored to the needs of an algorithm. FPGA compilers for high level languages are not hindered by fixed memory hierarchies. The constraint when compiling to FPGAs is the availability of resources. In this paper we describe how the dataflow intermediary of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Signal Processing Systems

دوره 87  شماره 

صفحات  -

تاریخ انتشار 2017